Learning Objectives

After completing this lesson, you’ll be able to:

Instructions

In this lesson, you will:

Archive File Handling

FME readers and writers can work with compressed, archived files of various formats. In addition to size reduction, these file types are a convenient way to store datasets that need to be handled as a single unit, for example, a set of multiple dataset files contained within a single zip file.

Archive File Reading

FME can read the following archive formats:

The dataset a reader reads is defined by the Source Dataset/Files parameter in the Navigator window:

Reading a zip file

This dataset parameter can point to an archived file, as shown in the screenshot above. You select the archive file in the source parameter, and FME will extract the data when it reads it.

This technique works whether the archived dataset is file-based (such as a single AutoCAD file) or folder-based (such as the set of files that make up a Shapefile dataset).
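To make the idea concrete, here is a rough Python sketch of what reading from an archive involves: unpack the archive to a temporary folder, then pass the extracted paths to whatever reads them. It only illustrates the concept and is not how FME implements it. The function name `extract_dataset` and the `suffixes` filter are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_dataset(archive_path, suffixes=None):
    """Unpack a zip archive to a temporary folder and return the extracted files.

    Illustrative only: `suffixes` is a hypothetical filter such as {".shp"}.
    Use this on trusted archives only, since extractall() writes whatever
    member paths the archive contains.
    """
    target = Path(tempfile.mkdtemp(prefix="archive_"))
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Collect every extracted file, including files in subfolders.
    files = sorted(p for p in target.rglob("*") if p.is_file())
    if suffixes is not None:
        wanted = {s.lower() for s in suffixes}
        files = [p for p in files if p.suffix.lower() in wanted]
    return files
```

A file-based dataset would yield a single path, while a folder-based dataset such as a Shapefile would yield several sibling files that belong together.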

Because FME supports reading archived files, you might notice the default file filter includes archive formats when browsing for a reader dataset:

File browser includes archive formats

Note

You can use wildcards when reading archive files, just as with regular files. See the documentation for examples.

Zip File Writing

Writing data as a zip file is particularly useful when the output data needs to be post-processed. For example, if you use a shutdown script to move or copy output data to a new location, handling a single archive file is more convenient than multiple data files.

The simplest way to create a zipped output is to change the file extension in the output dataset field:

Setting a path with a zip ending

You can also specify the filename to write to the archive file. A shortcut button does this for you:

Using the zip shortcut button

Notice the small icon to the left of the dataset field that indicates whether the dataset is zipped.

When the workspace is run, the log file reports the file creation at various points:

MULTI_WRITER: Output will be zipped 
Zipping the contents of the temporary dataset
Finished updating output zip file: `C:\FMEData\Output\Parks.zip'

...and the output is, indeed, a zip file:

Output Parks.zip
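The log messages suggest a write-then-zip sequence: the data is written to a temporary location, and that location is then compressed into the final archive. A minimal Python sketch of the same pattern follows. It is an analogy rather than FME's implementation, the function name `write_zipped` is hypothetical, and it assumes the output path ends in `.zip`.

```python
import shutil
import tempfile
from pathlib import Path


def write_zipped(files, output_zip):
    """Write files to a temporary folder, then zip that folder to output_zip.

    `files` maps a file name to its text content (a hypothetical stand-in
    for real writer output). Assumes output_zip ends in ".zip".
    """
    output_zip = Path(output_zip)
    with tempfile.TemporaryDirectory() as staging:
        for name, content in files.items():
            (Path(staging) / name).write_text(content)
        # make_archive appends ".zip" itself, so pass the path without it.
        base = output_zip.with_suffix("")
        created = shutil.make_archive(str(base), "zip", root_dir=staging)
    return Path(created)
```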

Note

Some users may want to archive data as a single entity to move or copy it to a different location. You can use a combination of user parameters and Tcl or Python shutdown scripts to find the file you just wrote and move or upload it. Alternatively, you can use the FeatureWriter transformer, which provides the dataset's path as an attribute, so you can process that path further with other FME transformers.
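As a rough sketch of the shutdown-script approach, the Python function below moves a finished archive into another folder. The function name `move_archive` is hypothetical, and how you obtain the archive path depends on your workspace; the parameter name in the comment is an assumption, not a built-in.

```python
import shutil
from pathlib import Path


def move_archive(archive_path, destination_dir):
    """Move a finished archive into destination_dir and return its new path."""
    archive_path = Path(archive_path)
    destination_dir = Path(destination_dir)
    if not archive_path.is_file():
        raise FileNotFoundError(f"No archive at {archive_path}")
    # Create the destination folder if it does not exist yet.
    destination_dir.mkdir(parents=True, exist_ok=True)
    new_path = shutil.move(str(archive_path), str(destination_dir / archive_path.name))
    return Path(new_path)


# In an FME shutdown script, the path could come from a user parameter, e.g.
#   fme.macroValues["OUTPUT_ZIP"]   (hypothetical parameter name)
```

Handling one archive keeps the script short: there is a single path to check and a single file to move.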

Tips and Tricks: Reading Files in ZIP Archives

Some file formats, such as .docx and .xlsx, are actually ZIP archives, but FME won't recognize them as such. What are your options for reading files like that?

  1. The most obvious solution is to use a built-in FME reader. We have readers for Microsoft Word and Excel, so you can just read those directly.
  2. But what if FME doesn't support the format, or you need to access a file inside the ZIP? Try something like this:
    1. Use a TempPathnameCreator and a FeatureWriter with the format set to File Copy to copy your target file to a temporary file with a .zip extension.
    2. Then use a FeatureReader to read the temporary ZIP, including the files inside it.
    3. If you still can't see the files, use a command-line utility and a SystemCaller to extract the files first.
    4. Don't know the names of the files in advance? Use the Directory and File Pathnames format in a FeatureReader to scan the extracted files and read just the ones you need.

Thanks to FME Community user ctredinnick for this advice.
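If you want to confirm outside FME that a file really is a ZIP archive, and see what it contains, a short Python check can help. The sketch below mirrors the copy-to-a-temporary-ZIP step from the list above. The function name `list_zip_based_file` is hypothetical, and the copy is only there to parallel the FME workflow, since Python's `zipfile` module does not depend on the file extension.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def list_zip_based_file(path):
    """Copy a ZIP-based file (e.g. .docx) to a temporary .zip and list its members."""
    path = Path(path)
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP archive")
    with tempfile.TemporaryDirectory() as tmp:
        # Same idea as the File Copy step: give the file a .zip name.
        temp_zip = Path(tmp) / (path.stem + ".zip")
        shutil.copyfile(path, temp_zip)
        with zipfile.ZipFile(temp_zip) as archive:
            return archive.namelist()
```

The returned member names tell you which internal file to target once the archive is readable.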